New Methods for Voice Conversion
Authors
Abstract
ACKNOWLEDGEMENTS
I would like to thank my thesis supervisor, Assoc. Prof. Levent M. Arslan, for his guidance and support. It was a great pleasure to work with him during this thesis and in all the projects we have been involved in over the last three years. I would like to thank Prof. … for reading my thesis and participating in my thesis committee. I would like to thank my family and Aylin for their endless love, encouragement, and support. I would like to express my gratitude to Barış Bozkurt for his inspiring ideas. Special thanks go to my colleagues at Sestek Inc. And to everyone who participated in the subjective tests: thanks for your patience and time. I promise the tests will be shorter next time.

ABSTRACT
This study focuses on various aspects of voice conversion and investigates new methods for implementing robust voice conversion systems that provide high-quality output. The relevance of several spectral and temporal characteristics for the perception of speaker identity is investigated using subjective tests. These characteristics include the subband-based spectral content, vocal tract, pitch, duration, and energy. Two new methods, based on the Wavelet Transform and on selective pre-emphasis, are described for transformation of the vocal tract spectrum. A new speaker-specific intonational model is developed and evaluated in terms of both accuracy and voice conversion performance. A voice conversion database in Turkish is collected and employed for the evaluation of the new methods.
Similar resources
Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems
This article examines methods for improving the performance of GMM-based voice conversion systems, taking the GMM method as the baseline. In current methods, because a single conversion function is used to convert all speech units and statistical averaging subsequently smooths the spectrum, we observe quality ...
Designing a new non-parallel training method for voice conversion that outperforms parallel training
Introduction: Voice mimicking by computers has been one of the most challenging topics in speech processing in recent years. A voice conversion system has two sides: on one side is the source speaker, whose voice is changed to mimic the voice of the target speaker on the other side. Two methods of p...
Cross-Lingual Voice Conversion
Cross-lingual voice conversion refers to the automatic transformation of a source speaker’s voice to a target speaker’s voice in a language that the target speaker cannot speak. It involves a set of statistical analysis, pattern recognition, machine learning, and signal processing techniques. This study focuses on the problems related to cross-lingual voice conve...
Voice conversion methods for vocal tract and pitch contour modification
This study proposes two new methods for detailed modeling and transformation of the vocal tract spectrum and the pitch contour. The first method (selective pre-emphasis) relies on band-pass filtering to perform vocal tract transformation. The second method (segmental pitch contour model) focuses on a more detailed modeling of pitch contours. Both methods are utilized in the design of a voice co...
Frame alignment method for cross-lingual voice conversion
Most of the existing voice conversion methods calculate the optimal transformation function from a given set of paired acoustic vectors of the source and target speakers. The alignment of the phonetically equivalent source and target frames is problematic when the training corpus available is not parallel, although this is the most realistic situation. The alignment task is even more difficult ...